BUG: Fix Categorical displays string categories without quotes when dtype is "string" #63070
jorisvandenbossche
left a comment
Thanks for the PR!
doc/source/whatsnew/v3.0.0.rst
Outdated
| Categorical | ||
| ^^^^^^^^^^^ | ||
| - Bug in :class:`Categorical` where constructing from a pandas :class:`Series` or :class:`Index` with ``dtype='object'`` did not preserve the categories' dtype as ``object``; now the ``categories.dtype`` is preserved as ``object`` for these cases, while numpy arrays and Python sequences with ``dtype='object'`` continue to infer the most specific dtype (for example, ``str`` if all elements are strings) (:issue:`61778`) | ||
| - Bug in :class:`pandas.Categorical` displaying string categories without quotes when constructed from a Series with dtype "string" (:issue:`63045`) |
| - Bug in :class:`pandas.Categorical` displaying string categories without quotes when constructed from a Series with dtype "string" (:issue:`63045`) | |
| - Bug in :class:`pandas.Categorical` displaying string categories without quotes when using "string" dtype (:issue:`63045`) |
The issue is not so much that the Categorical was created from a Series, but that it uses the string dtype for its categories (you can construct the same categorical in other ways as well).
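To illustrate the point about other construction routes, here is a minimal sketch (not part of the PR) that builds the same categorical twice, once from a Series and once without any Series, and checks the dtype of the categories. The variable names are illustrative only, and the exact repr depends on the pandas version, so no output is shown.

```python
import pandas as pd

values = ["apple", "banana", "cherry", "cherry"]

# Route 1: from a Series that already uses the "string" dtype (the case in the issue).
from_series = pd.Categorical(pd.Series(values, dtype="string"))

# Route 2: no Series involved; the categories are passed as a string-dtype Index.
string_categories = pd.Index(["apple", "banana", "cherry"], dtype="string")
from_codes = pd.Categorical.from_codes([0, 1, 2, 2], categories=string_categories)

for cat in (from_series, from_codes):
    # Both report a StringDtype for the categories, so both go through the same repr logic.
    print(type(cat.categories.dtype).__name__, list(cat))
```

Since both objects end up with string-dtype categories, the whatsnew entry is more accurate when it describes the dtype rather than the Series constructor.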
Thanks for your review. I have updated this doc accordingly.
| expected = "[1, '2', 3, 4]\nCategories (4, object): [1, 3, 4, '2']" | ||
| assert result == expected | ||
|
|
||
| def test_categorical_with_pandas_series(self): |
| def test_categorical_with_pandas_series(self): | |
| def test_categorical_with_string_dtype(self): |
| def test_categorical_with_pandas_series(self): | ||
| # GH 63045 | ||
| s = Series(["apple", "banana", "cherry", "cherry"], dtype="string") |
| def test_categorical_with_pandas_series(self): | |
| # GH 63045 | |
| s = Series(["apple", "banana", "cherry", "cherry"], dtype="string") | |
| def test_categorical_with_pandas_series(self, string_dtype_no_object): | |
| # GH 63045 | |
| s = Series(["apple", "banana", "cherry", "cherry"], dtype=string_dtype_no_object) |
You could use this fixture here, which runs the test against the different string dtype variations, to make sure we now do this consistently for all string-like dtypes.
The only thing you will have to update is the "string" in the expected result below (you can include str(string_dtype_no_object) in the expected value)
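A rough sketch of that idea outside the pandas test suite: `STRING_DTYPES` below is a hypothetical stand-in for the `string_dtype_no_object` fixture (the real fixture may cover more variants, such as pyarrow-backed storage), and `categories_header` is an illustrative helper, not a pandas function. It only checks the dtype-dependent part of the repr footer, since the quoting of the category labels depends on whether the fix is present.

```python
import pandas as pd

# Hypothetical stand-in for the `string_dtype_no_object` fixture.
STRING_DTYPES = ["string", pd.StringDtype("python")]


def categories_header(dtype) -> str:
    """Build the dtype-dependent start of the expected repr footer."""
    return f"Categories (3, {dtype!s}): "


headers = []
for dtype in STRING_DTYPES:
    s = pd.Series(["apple", "banana", "cherry", "cherry"], dtype=dtype)
    cat = pd.Categorical(s)
    # Derive the dtype name from the object instead of hard-coding "string",
    # so the same expectation works for every string dtype variant.
    header = categories_header(cat.categories.dtype)
    assert header in repr(cat)
    headers.append(header)
```

In the actual test, the loop would be replaced by the fixture argument, and the full expected string would be built the same way with `str(string_dtype_no_object)`.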
Thanks for your review. I have updated this test accordingly.
Some tests failed, but I don’t think it’s because of my latest commit since it only changed the test case and doc. I also ran the failed tests locally and got xfail results. |
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
Fix display of string categories without quotes, as reported in issue #63045, by adding a check for categories with dtype 'string'.
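For readers who want a feel for what such a check might look like, the following is a simplified sketch and not the code changed in this PR. `format_categories` is a hypothetical helper; it assumes that string labels should be quoted whenever the categories use either a string dtype or `object` dtype.

```python
import pandas as pd


def format_categories(categories: pd.Index) -> list[str]:
    """Hypothetical helper: render category labels, quoting string labels."""
    # Treat both string dtypes and object dtype as candidates for quoting.
    may_hold_strings = (
        isinstance(categories.dtype, pd.StringDtype) or categories.dtype == object
    )
    return [
        repr(label) if may_hold_strings and isinstance(label, str) else str(label)
        for label in categories
    ]


print(format_categories(pd.Index(["apple", "banana"], dtype="string")))
print(format_categories(pd.Index([1, 2, 3])))
```

The point of the dtype check is that the quoting decision no longer depends on how the categorical was constructed, only on what kind of values the categories hold.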